Update XML Views 



Jixue Liu 1 Chengfei Liu 2 Theo Haerder 3 Jeffery Xu Yu 4 

1 School of Computer and Information Science, University of South Australia 

email: jixue.liu@unisa.edu.au
2 Faculty of ICT, Swinburne University of Technology 
email: cliu@swin.edu.au 
3 Dept of Computer Sciences, Technical University of Kaiserslautern 
email: haerder@informatik.uni-kl.de 
4 Dept of Systems Eng. and Eng. Management, Chinese University of HK 

email: yu@se.cuhk.edu.hk 

Completed 2011 June 



Abstract 

View update is the problem of translating an update to a view to 
some updates to the source data of the view. In this paper, we show the 
factors determining XML view update translation, propose a translation 
procedure, and propose translated updates to the source document for 
different types of views. We further show that the translated updates are 
precise. The proposed solution makes it possible for users who do not 
have access privileges to the source data to update the source data via a 
view. 

keywords: XML data, view update, update translation, virtual views 



1 Introduction 

A (virtual) view is defined with a query over some source data of a database. 
The query is called the view definition which determines what data appears in 
the view. The data of the view, called a view instance, is often not stored 
in the database but is derived from the source data on the fly using the view 
definition every time when the view is selected. 

In database applications, many users do not have privileges to access all the 
data of a database. They are often given a view of the database so that they can 
retrieve only the data in the view. When these users need to update the data 
of the database, they put their updates against the view, not against the source 
data, and expect that the view instance is changed when it is accessed next 
time. This type of updates is called a view update. Because of its important 



use, view update has a long research history [TJ [51 [TOl QTJ [SJ El H2] . The work 
in [3] discusses detailed semantics of view updates in many scenarios. 

Unfortunately, view updates cannot be directly applied to the view instance 
as it is not stored physically and is derived on the fly when required (virtual 
view). Even in the cases where the view instance is stored (materialized view), 
which is not the main focus of this paper, applying updates to the instance may 
cause inconsistencies between the source data and the instance. To apply a view 
update to a virtual view, a translation process is required to translate the view 
update to some source updates. When the source data is changed, the data in 
the view will be changed next time when the view is selected. To the user of the 
view, it seems that the view update has been successfully applied to the view 
instance. 

Let $V$ be a view definition, $V^i$ the view instance, $S^i$ the source data of
the view, and $V(S^i)$ the evaluation of $V$ against $S^i$. Then $V^i = V(S^i)$.
Assume that the user wants to apply a view update $\delta V$ to $V^i$ as
$\delta V(V^i)$. View update translation is to find a process that takes $V$ and
$\delta V$ as input and produces a source update $\delta S$ to $S^i$ such that
next time when the user accesses the view, the view instance appears changed and
is as expected by the user. That is, for any $S^i$ and $V^i = V(S^i)$,

$$V(\delta S(S^i)) = \delta V(V^i) \qquad (1)$$

Two typical anomalies, view side-effect and source document over-update,
are easily introduced by the translation process although they are update
policy dependent [8]. View side-effect [12] is the case where the translated
source update causes more-than-necessary change to the source data which leads
to more-than-expected change to the view instance. View side-effect violates
Equation (1).

Over-updates may also happen to a source document. An over-update to a
source document causes the source data irrelevant to the view to be changed,
but keeps the equation satisfied. A source document over-update is incorrect as
it changes information that the user did not expect to change.

A precise translation of a view update should produce source updates that
(1) result in necessary (as the user expects) change to the view instance, (2)
do not cause view side-effect, and (3) do not cause over-updates to the source
documents.

In relational databases, extensive work has been done on view update and 
the problem has been well understood [TJ [SJ [TU] . In cases of updating XML 
views over relational databases, updates to XML views need to be translated to 
updates to the base relational tables. The works in [3, 12] propose two different
approaches to the problem. The work in [3] translates an XML view to some 
relational views and an update to the XML view to updates to the relational 
views. It then uses the relational approach to derive updates to the base tables. 
The work in [12] derives a schema for the XML view and annotates the schema
based on keys of relational tables and multiplicities. An algorithm is proposed
to use the annotation to determine if a translation is possible and how the
translation works. Both works assume keys, foreign keys and the join operator
based on these two types of constraints. Another work, technical report [5],
briefly discusses updating hypertext views defined on relational databases.
To the best of our knowledge, the only work relating to XML view update is 
[7] which proposes a middle language and a transformation system to derive 
view instance from source data, and to derive source data from a materialized 
view instance, and assumes XQuery as the view definition language. We argue 
that with the view update problem, only view updates are available but not the 
view instance (not materialized). Consequently view update techniques are still 
necessary. 

In this paper, we look into the view update problem in pure XML context. 
This means that both source data and the view are in XML format. We as- 
sume that base XML documents have no schema and no constraints information 
available. 

The view update problem in the relational database is already difficult as 
not all view updates are translatable. For example, if a view V is defined by a 
Cartesian product of two tables R and S, an update inserting a new tuple to the 
view instance is not translatable because there is no unique way to determine the 
change(s) to R and S. The view update problem in XML becomes much harder. 
The main reason is that the source data and view instances are modeled in trees 
and trees can nest in arbitrary levels. This fundamental difference makes the 
methods of translating view updates in the relational database not applicable to 
translating XML view updates. For example, the selection and the projection 
in the relational database do not have proper counterparts in XML. The view 
update problem in XML has many distinct cases that do not exist in the view 
update problem in the relational database (see Sections [3] and [5] for details). To 
the best of our knowledge, our work is the first proposing a solution to the view 
update problem in XML. 

We notice that the view update problem is different from the view mainte- 
nance problem. The former aims to translate a view update to a virtual view 
to a source update while the latter aims to translate a source update to a view 
update to a materialized view. The methods for one do not work for the other. 

We make the following contributions in this paper. Based on the view defini- 
tion and the update language presented later, we identify the factors determining 
the view update problem. We propose a translation algorithm to translate view 
updates to source updates. Furthermore, we propose translated updates to the 
source for different types of view updates. The types of view updates range
from the case where the update involves an individual tree selected from the
source, through the case where the update involves multiple trees from the
source, to the case where the update happens to the root of the view. For each
proposed update
to the source, we prove that it is precise. 

The paper is organized as follows. Section 2 shows the view definition
language, the update language, and the preciseness of view update translation.
In Section 3, we propose an algorithm and show that the translation obtained
by the algorithm is a precise translation. In Section 4, we identify a 'join' case
where a translated update is precise. Section 5 shows a translation when a main
subtree of the view is deleted. Section 6 concludes the paper.

2 Preliminaries 

In this section, we define basic notation, introduce the languages for view defi- 
nitions and updates, and define the XML view update problem. 

Definition 1 (tree). An XML document can be represented as an ordered tree.
Each node of the tree has a unique identifier $v_i$, an element name $ele$ also
called a label, and either a text string $txt$ or a sequence of child trees
$T_{j_1}, \dots, T_{j_n}$. That is, a node is either $(v_i : ele : txt)$ or
$(v_i : ele : T_{j_1}, \dots, T_{j_n})$. When the context is clear, some or all
of the node identifiers of a tree may not be presented explicitly. A tree
without all node identifiers is called a value tree. Two trees $T_1$ and $T_2$
are (value) equal, denoted by $T_1 = T_2$, if they have identical value trees.
If a tree $T_1$ is a subtree in $T_2$, $T_1$ is said to be in $T_2$, denoted by
$T_1 \in T_2$. □

For example, the document <root><A><B>1</B></A><A><B>2</B></A></root>
is represented by $T = (v_r:root:(v_0:A:(v_1:B:1)),(v_2:A:(v_3:B:2)))$. The value
tree of $T$ is $(root:(A:(B:1)),(A:(B:2)))$.
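
As an illustration of Definition 1, the following minimal sketch models a node as an identifier, a label, and either a text string or an ordered list of child trees. The names `Node`, `value_tree` and `value_equal` and the identifier strings are our own illustrative assumptions and are not part of the paper's formalism.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Node:
    ident: str                         # unique node identifier, e.g. "v1"
    label: str                         # element name (label)
    content: Union[str, List["Node"]]  # text string or ordered child trees


def value_tree(t: Node):
    """Return the tree without node identifiers (the value tree)."""
    if isinstance(t.content, str):
        return (t.label, t.content)
    return (t.label, tuple(value_tree(c) for c in t.content))


def value_equal(t1: Node, t2: Node) -> bool:
    """Two trees are value equal if their value trees are identical."""
    return value_tree(t1) == value_tree(t2)


# The document <root><A><B>1</B></A><A><B>2</B></A></root>
T = Node("vr", "root", [
    Node("v0", "A", [Node("v1", "B", "1")]),
    Node("v2", "A", [Node("v3", "B", "2")]),
])
```

Under this representation, two nodes with different identifiers but the same label and text are value equal, while the two A subtrees of `T` are not.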

Definition 2. A path $p$ is a sequence of element names $e_1/e_2/\cdots/e_n$
where all names are distinct. The function $L(p)$ returns the last element name
$e_n$.

Given a path $p$ and a sequence of nodes $v_1, \dots, v_n$ in a tree, if for
every node $v_i \in [v_2, \dots, v_n]$, $v_i$ is labeled by $e_i$ and is a child
of $v_{i-1}$, then $v_1/\cdots/v_n$ is a doc path conforming to $p$ and the tree
rooted at $v_n$ is denoted by $T^{p}_{v_n}$. □
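
Definition 2 can be sketched as follows. Here a tree is simplified to a pair `(label, content)` without node identifiers, and `last_name` and `locate` are hypothetical helper names chosen for this illustration only.

```python
def last_name(path):
    """L(p): the last element name of a '/'-separated path."""
    return path.split("/")[-1]


def locate(tree, path):
    """Return the subtrees rooted at the end of every doc path conforming to `path`.

    A tree is a pair (label, content) where content is a text string or a
    list of child trees; the first name of `path` must label `tree` itself.
    """
    names = path.split("/")
    if tree[0] != names[0]:
        return []
    frontier = [tree]
    for name in names[1:]:
        # move one level down, keeping only children with the required label
        frontier = [
            child
            for _, content in frontier
            if isinstance(content, list)
            for child in content
            if child[0] == name
        ]
    return frontier


doc = ("root", [("A", [("B", "1")]), ("A", [("B", "2")])])
```

A path may locate zero, one or many subtrees: `locate(doc, "root/A/B")` finds both B subtrees, while `locate(doc, "root/C")` finds none.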

2.1 View definition language 

We assume that a view is defined in a dialect of the for-where-return clauses 
of XQuery 0. 

Definition 3 ($V$). A view is defined by

<v>{ for $x_1$ in $p_1$, $\dots$, $x_n$ in $p_n$
     where $cdn(x_1, \dots, x_n)$
     return $rtn(x_1, \dots, x_n)$ }</v>

where $p_1, \dots, p_n$ are paths (Definition 2) preceded by doc() or $x_i$;

$cdn(x_1, \dots, x_n)$ ::= $x_i/\mathcal{E}_i = x_j/\mathcal{E}_j$ and $\cdots$ and $x_k/\mathcal{E}_k = strVal$ and $\cdots$;
$rtn(x_1, \dots, x_n)$ ::= <e> {$x_u/\gamma_u$} $\cdots$ {$x_v/\gamma_v$} </e>;

$\gamma$, $\mathcal{E}$ are paths, and the last elements of all $x_u/\gamma_u, \dots, x_v/\gamma_v$ are distinct.

□

We note that the paths in the return clause are denoted by $x_i/\gamma_i$ because
these expressions are especially important in view update translation. We
purposely leave out the dollar sign that precedes a variable in the XQuery
language.

Definition 4 (context-based production). By the formal semantics of XQuery
[5], the semantics of the language is

for $x_1$ in $p_1$ return
  for $x_2$ in $p_2$ return
    $\cdots$
      for $x_n$ in $p_n$ return
        if $cdn(x_1, \dots, x_n)$ = true
        return $rtn(x_1, \dots, x_n)$

The for-statement produces tuples $\langle x_1, \dots, x_n \rangle$, denoted by
$fortup(V)$, where the variable $x_i$ represents a binding out of the subtrees
located by $p_i$ within the context defined by $x_1, \dots, x_{i-1}$. This
process is called context-based production. □
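
The nested for-loops of Definition 4 can be sketched as below. The representation (trees as `(label, content)` pairs, a `#doc` wrapper node, for-clauses as `(variable, base, path)` triples) and the sample document are assumptions made for illustration only; they are not the source document of the paper's figures.

```python
def children(tree, name):
    """Children of `tree` labelled `name` (empty for text nodes)."""
    _, content = tree
    return [c for c in content if c[0] == name] if isinstance(content, list) else []


def locate(tree, path):
    """Subtrees reached from `tree` by following the child labels in `path`."""
    frontier = [tree]
    for name in path.split("/"):
        frontier = [c for t in frontier for c in children(t, name)]
    return frontier


def fortup(docs, clauses):
    """Context-based production: each clause extends every tuple built so far.

    clauses: list of (var, base, path); base is a document name in `docs`
    or a variable bound by an earlier clause.
    """
    tuples = [{}]
    for var, base, path in clauses:
        extended = []
        for binding in tuples:
            context = binding[base] if base in binding else docs[base]
            for tree in locate(context, path):
                extended.append({**binding, var: tree})
        tuples = extended
    return tuples


# Hypothetical document: the first A has two C children and one H, the second has no H.
r = ("r", [
    ("A", [("C", [("D", "1")]), ("C", [("D", "2")]), ("H", "1")]),
    ("A", [("C", [("D", "1")])]),
])
docs = {"r": ("#doc", [r])}
tuples = fortup(docs, [("x", "r", "r/A"), ("y", "x", "C"), ("z", "x", "H")])
```

In this sketch the first A binding appears in two tuples (once per C child), while the second A binding appears in none because it has no H child.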

For each tuple satisfying the condition $cdn(x_1, \dots, x_n)$, the function
$rtn(x_1, \dots, x_n)$ produces a tree, called an e-tree, under the root node of
the view. That is, $V$ maps a tuple to an e-tree. The children of the e-tree are
the $\gamma$-trees selected by all the expressions $x_i/\gamma_i$ (for all $i$)
from the tuple. A tuple is mapped to one and only one e-tree and an e-tree is
for one and only one tuple. A $\gamma$-tree of a tuple is uniquely mapped to a
child of the e-tree of the tuple and a child of an e-tree is for one and only
one $\gamma$-tree of its tuple.

The path of a node $s$ in the view has the following format:

$$v/e/\mathcal{L}_i/\theta_i \qquad (2)$$
$$\mathcal{L}_i = L(x_i/\gamma_i) \qquad (3)$$

where $x_i/\gamma_i$ is an expression in $rtn(x_1, \dots, x_n)$,
$L(x_i/\gamma_i)$ returns the last element name of the path $x_i/\gamma_i$, and
$\theta_i$ is a path following $\mathcal{L}_i$ in the view. When
$\mathcal{L}_i/\theta_i$ is not empty, the path in the source document
corresponding to $v/e/\mathcal{L}_i/\theta_i$ is

$$x_i/\gamma_i/\theta_i$$

The view definition has some properties important to view update translation.
Firstly, because of context-based production, a binding of variable $x_i$ may
be copied into $x_i^{(1)}, \dots, x_i^{(m)}$ to appear in multiple tuples:

$$\langle \cdots, x_i^{(1)}, \cdots, x_{j[1]}, \cdots \rangle$$
$$\cdots$$
$$\langle \cdots, x_i^{(m)}, \cdots, x_{j[m_j]}, \cdots \rangle$$

where $x_{j[1]}, \dots, x_{j[m_j]}$ are different bindings of $x_j$. Each tuple
satisfying the condition $cdn(x_1, \dots, x_n)$ is used to build an e-tree. As a
result of $x_i$ being copied, the subtrees of $x_i$ will be copied accordingly
to appear in multiple e-trees in the view.

Secondly, a tree may have zero or many subtrees located by a given path $p$.
That is, given a tree bound to $x_i$, the path expression $x_i/p$ may locate
zero or many subtrees $T^{x_i/p}_1, \dots, T^{x_i/p}_n$ in $x_i$. This is true
in the source documents and in the view.

Thirdly, two path expressions $x_i/\gamma_i$ and $x_j/\gamma_j$ generally may
have the same last element name, i.e., $L(x_i/\gamma_i) = L(x_j/\gamma_j)$. For
example, if $x_i$ represents an employee while $x_j$ represents a department,
then $x_i/name$ and $x_j/name$ will present two types of names in the same
e-tree. This makes the semantics of the view data unclear. This is the reason
that we assume that all $L(x_i/\gamma_i)$ are distinct.

Example 1. Consider the view definition below and the source document shown
in Figure 1(a). The view instance is shown in Figure 1(b).

<v>{for x in doc("r")/r/A, y in x/C, z in x/H 
where y/D=z and z="l" 
return <e>{x/B}{x/C}{y/F/G}{z}</e> 
}</v> 




Figure 1: Source document r and view v 

From the view definition, $\gamma_1 = B$, $\gamma_2 = C$, $\gamma_3 = F/G$, and
$\gamma_4 = \phi$. $L(x/\gamma_1) = \mathcal{L}_1 = B$,
$L(x/\gamma_2) = \mathcal{L}_2 = C$, $L(y/\gamma_3) = \mathcal{L}_3 = G$, and
$L(z/\gamma_4) = \mathcal{L}_4 = H$.

Formula (2) is exemplified as follows. The node $v_3$ in the view has the
path $v/e/C/F/G$ where $C$ is $\mathcal{L}_2 = L(x/\gamma_2)$ and $F/G$ is
$\theta$. The node $v_1$ is an e node; its path is $v/e$ and
$\mathcal{L}_i/\theta_i$ is $\phi$.

The example shows the following.

• The expression $x/B$ ($= x/\gamma_1$) of the return clause has no tree in the
  e-trees.

• The path expression $x/C$ ($= x/\gamma_2$) has multiple trees in an e-tree.

• The trees of $x/C$ are duplicated in the view and so are their subtrees.

• Some $x/C$ trees each have more than one $x/C/F$ ($= x/\gamma_2/\theta$)
  subtree.
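
The four observations can be reproduced with a small sketch that hard-codes the view of Example 1 as explicit nested loops. The source document `r` below is a hypothetical stand-in (the document of Figure 1 is not reproduced here), and the helper names `kids`, `text` and `view_example1` are ours.

```python
def kids(tree, name):
    """Children of `tree` labelled `name` (empty for text nodes)."""
    _, content = tree
    return [c for c in content if c[0] == name] if isinstance(content, list) else []


def text(tree):
    """Text of a leaf node, or None for a node with child trees."""
    return tree[1] if isinstance(tree[1], str) else None


def view_example1(r):
    """for x in r/A, y in x/C, z in x/H where y/D=z and z="1"
    return <e>{x/B}{x/C}{y/F/G}{z}</e>"""
    e_trees = []
    for x in kids(r, "A"):
        for y in kids(x, "C"):
            for z in kids(x, "H"):
                d_values = [text(d) for d in kids(y, "D")]
                if text(z) in d_values and text(z) == "1":
                    g_trees = [g for f in kids(y, "F") for g in kids(f, "G")]
                    e_trees.append(("e", kids(x, "B") + kids(x, "C") + g_trees + [z]))
    return ("v", e_trees)


# Hypothetical source: one A with no B, two C trees and one H.
r = ("r", [("A", [
    ("C", [("D", "1"), ("F", [("G", "g1")]), ("F", [("G", "g2")])]),
    ("C", [("D", "1")]),
    ("H", "1"),
])])
v = view_example1(r)
```

With this input the view contains two e-trees. Neither has a B child, each contains both C trees (so the C trees are duplicated across e-trees), and the first C tree has two F subtrees.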

2.2 The update language 

The update language we use follows the proposal [5] extended from XQuery. 

Definition 5 ($\delta V$). A view update statement has the format of

for $x_1$ in $p_1$, $\dots$, $x_u$ in $p_u$
where $x_c/p_c = strValu$
update $x_t/p_t$ ( delete $T$ | insert $T$ )

where $x_c, x_t \in [x_1, \dots, x_u]$; $p_1, \dots, p_u$ are paths (Definition 2)
preceded by $v$ or $x_i$; $p_c$, $p_t$ are paths; all element names in the paths
are element names in the view. $x_c/p_c$ and $x_t/p_t$ are called the (update)
condition path and (update) target path respectively. □

The next process builds the mapping represented by Formula (3).

Procedure 1 (mapping). When the variables in $x_c/p_c$ and $x_t/p_t$ are
replaced by their paths in the for-clause until the first element name becomes
$v$, the full paths of $x_c/p_c$ and $x_t/p_t$ will have the format of
$v/e/\mathcal{L}_c/\theta_c$ and $v/e/\mathcal{L}_t/\theta_t$ as shown in
Formula (2). The element names $\mathcal{L}_c$ and $\mathcal{L}_t$, if
$\mathcal{L}_c/\theta_c$ and $\mathcal{L}_t/\theta_t$ are not empty, must be the
last element names of two expressions $x_c/\gamma_c$ and $x_t/\gamma_t$ in the
return clause of the view definition $V$. A search using $\mathcal{L}_c$ and
$\mathcal{L}_t$ in $V$ will identify the expressions. Consequently
$v/e/\mathcal{L}_c/\theta_c$ and $v/e/\mathcal{L}_t/\theta_t$ are mapped to
$x_c/\gamma_c/\theta_c$ and $x_t/\gamma_t/\theta_t$ respectively. □
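
Procedure 1 can be sketched as two string operations: expanding the leading variable of an update path into a full view path, and then looking up the return-clause expression whose last element name matches. The function names and the dictionary form of the for-clause are assumptions for this illustration. The sketch also assumes every return expression has a non-empty $\gamma$, so a bare variable such as `z` is not handled.

```python
def expand(path, for_clause):
    """Replace the leading variable by its for-clause path until none is left."""
    head, _, rest = path.partition("/")
    while head in for_clause:
        path = for_clause[head] + ("/" + rest if rest else "")
        head, _, rest = path.partition("/")
    return path


def map_to_source(view_path, view_root, e_name, rtn_exprs):
    """Map a full view path v/e/L/theta to the source expression x/gamma/theta."""
    parts = view_path.split("/")
    if parts[:2] != [view_root, e_name] or len(parts) < 3:
        raise ValueError("path does not point inside a gamma-tree")
    label, theta = parts[2], parts[3:]
    for expr in rtn_exprs:
        # search the return clause for the expression ending in the label
        if expr.split("/")[-1] == label:
            return "/".join([expr] + theta)
    raise KeyError(label)


# Return clause of a view <Qbk> with e-trees named <use> (illustrative input).
rtn = ["x/auths", "x/title", "y/uName", "z/profs"]
full = expand("r/title", {"r": "Qbk/use"})
```

Because the last element names of the return expressions are assumed to be distinct (Definition 3), the search returns at most one expression.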

With this mapping, the update statement $\delta V$ can be represented by the
following abstract form:

$$(p_s;\ v/e/\mathcal{L}_c/\theta_c = strValu;\ v/e/\mathcal{L}_t/\theta_t;\ del(T) \mid ins(T)) \qquad (4)$$

where

• $v/e/\mathcal{L}_c/\theta_c$ is the full update condition path (in the view)
  for $x_c/p_c$, and $v/e/\mathcal{L}_t/\theta_t$ is the full target path for
  $x_t/p_t$;

• $p_s$ is the maximal common front part of $v/e/\mathcal{L}_c/\theta_c$ and
  $v/e/\mathcal{L}_t/\theta_t$.

The semantics of an update statement is that under a context node identified
by $p_s$, if a subtree identified by $v/e/\mathcal{L}_c/\theta_c$ satisfies the
update condition, the update action ($del(T)$ or $ins(T)$) is applied to all
the subtrees identified by $v/e/\mathcal{L}_t/\theta_t$. The subtree
$T^{v/e/\mathcal{L}_c/\theta_c}$ is called the condition tree of
$T^{v/e/\mathcal{L}_t/\theta_t}$. A subtree is updated only if it has a
condition tree and the condition tree satisfies the update condition. An update
target and its condition trees are always within a tuple when the view
definition is evaluated and are in an e-tree in the view after the evaluation.

We note that because of the context-based production in the update language,
the same update action may be applied to a target node multiple times. For
example, suppose $x$ is a binding and the context-based production produces
two tuples for it, $\langle x^{(1)}, \cdots \rangle$ and
$\langle x^{(2)}, \cdots \rangle$. If the update condition and target are all
in $x$, $x$ will be updated twice with the same action. We assume that only the
effect of the first application is taken and the effects of all other
applications are ignored.

Based on the structure of the target path $tp = v/e/\mathcal{L}_t/\theta_t$,
updates may happen to different types of nodes in the view.

• When $\mathcal{L}_t/\theta_t \neq \phi$, the update happens to the nodes
  within a $\gamma$-tree.

• When $tp = v/e$, the update will add or delete a $\gamma$-tree.

• When $tp = v$ (in this case, $p_s = v$), the update will add or delete an
  e-tree.

We will present the first case in Sections 3 and 4 and present the last two
cases in Section 5.

2.3 The view update problem

Definition 6 (Precise Translation). Let $V$ be a view definition and $S$ be the
source of $V$. Let $\delta V$ be an update statement to $V$. Let $\delta S$ be
the update statement to $S$ translated from $\delta V$. $\delta S$ is a precise
translation of $\delta V$ if, for any instance $S^i$ of $S$ and $V^i = V(S^i)$,

(1) $\delta S$ is correct. That is, $V(\delta S(S^i)) = \delta V(V^i)$ is true;
    and

(2) $\delta S$ is minimal. That is, there does not exist another translation
    $\delta S'$ such that ($\delta S'$ is correct, i.e.,
    $V(\delta S'(S^i)) = V(\delta S(S^i)) = \delta V(V^i)$, and there exists a
    tree $T$ in $S^i$ and $T$ is updated by $\delta S$ but not $\delta S'$). □

We note that Condition (1) also means that the update $\delta S$ will not cause
view side-effect. Otherwise, $V(\delta S(S^i))$ would contain more, fewer, or
different updated trees than those in $\delta V(V^i)$.

Definition 7 (the view update problem). Given a view $V$ and a view update
$\delta V$, the problem of view update is to (1) develop a translation process
$P$, and show that the source update $\delta S$ obtained from $P$ is precise,
or (2) prove that a precise translation of $\delta V$ does not exist. □

3 Update Translation when $\mathcal{L}_t/\theta_t \neq \phi$ and $x_c = x_t$

In this section, we investigate update translation when the update is to change
a $\gamma$-tree of the view and the mappings of the update condition path and
the target path refer to the same variable. We present Algorithm 1 for view
update translation in this case. The algorithm is self-explanatory.



Algorithm 1: A translation algorithm

Input: view definition $V$, view update $\delta V$
Output: translated source update $\delta S$
begin
  make a copy of $V$ and reference the copy by $\delta S$;
  remove $rtn()$ from $\delta S$;
  from the view update $\delta V$, following Procedure 1, find mappings
    $x_c/\gamma_c/\theta_c$ and $x_t/\gamma_t/\theta_t$ for the condition path
    $x_c/p_c$ and the target path $x_t/p_t$;
  make a copy of $\delta V$ and reference the copy by $\delta V_c$;
  in $\delta V_c$, replace $x_c/p_c$ and $x_t/p_t$ by $x_c/\gamma_c/\theta_c$
    and $x_t/\gamma_t/\theta_t$ respectively;
  append the condition in the where clause of $\delta V_c$ to the end of the
    where clause in $\delta S$ using logic and;
  append the update clause of $\delta V_c$ after the where clause of $\delta S$
end



By the algorithm, the following source update is derived.

$\delta S$: for $x_1$ in $p_1$, $\dots$, $x_n$ in $p_n$                    (5)
     where $cdn(x_1, \dots, x_n)$ and $x_c/\gamma_c/\theta_c = strValu$
     update $x_t/\gamma_t/\theta_t$ ( insert $T$ | delete $T$ )
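
The derivation of statement (5) can be sketched on a simple dictionary representation of $V$ and $\delta V$. The dictionary keys, the function name `translate` and the assumption that the update paths are already given as full view paths are illustrative choices of ours, not part of the algorithm as stated.

```python
def translate(view, update):
    """Sketch of Algorithm 1: build the text of the source update from V and delta-V."""

    def to_source(view_path):
        # Procedure 1: map v/e/L/theta to x/gamma/theta via the return clause
        parts = view_path.split("/")
        label, theta = parts[2], parts[3:]
        for expr in view["rtn"]:
            if expr.split("/")[-1] == label:
                return "/".join([expr] + theta)
        raise KeyError(label)

    cond = to_source(update["cond_path"])
    target = to_source(update["target_path"])
    # copy of V without rtn(): keep the for clause and the where clause
    lines = ["for " + ", ".join(view["for"])]
    # append the mapped update condition with a logical "and"
    lines.append(f'where {view["where"]} and {cond}="{update["value"]}"')
    # append the update clause with the mapped target path
    lines.append(f'update {target} {{ {update["action"]} }}')
    return "\n".join(lines)


view = {
    "for": ['x in doc("bkInf.xml")/bkInf/book',
            'y in doc("subjInf.xml")/subjInf/uni',
            'z in y/subjs/subj'],
    "where": "x/title=z/title",
    "rtn": ["x/auths", "x/title", "y/uName", "z/profs"],
}
update = {
    "cond_path": "Qbk/use/title", "value": "IS",
    "target_path": "Qbk/use/auths", "action": "insert <aName>Susan</aName>",
}
source_update = translate(view, update)
```

The sketch only assembles the statement text. It does not check the conditions under which the resulting update is precise.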

We now develop the preciseness of the translation. We recall that $fortup(V)$
denotes the tuples of the context-based production (Definition 4) of $V$.
$x_c^{(1)}$ and $x_c^{(2)}$ are two copies of a binding of $x_c$, and $x_c$,
$x_{c[1]}$ and $x_{c[2]}$ are three separate bindings of $x_c$.

Lemma 1. Given a tuple $t = \langle x_t, x_c, \cdots \rangle \in fortup(V)$ and
its e-tree $e$, (1) if $T$ is a tree for the path $x_t/\gamma_t/\theta_t$ in $t$
and $T$ is updated by $\delta S$, then all the trees identified by
$x_t/\gamma_t/\theta_t$ in $t$ are updated by $\delta S$, and all the trees
identified by $\mathcal{L}_t/\theta_t$ in $e$ are updated by $\delta V$. (2) If
$T$ is a tree for the path $\mathcal{L}_t/\theta_t$ in $e$ and $T$ is updated by
$\delta V$, then all the trees identified by $x_t/\gamma_t/\theta_t$ in $t$ are
updated by $\delta S$, and all the trees identified by $\mathcal{L}_t/\theta_t$
in $e$ are updated by $\delta V$.

The lemma is correct because of the one-to-one correspondences between a
tuple and an e-tree and between $t$'s $\gamma$-trees and $e$'s children, and
because all the trees identified by $x_t/\gamma_t/\theta_t$ in $t$ share the
same condition tree(s) identified by $x_c/\gamma_c/\theta_c$ in $x_c$ of $t$,
and all the trees identified by $\mathcal{L}_t/\theta_t$ in $e$ share the same
condition tree(s) identified by $\mathcal{L}_c/\theta_c$ in $e$.

Lemma 2. Given a tuple $t = \langle x_t, x_c, \cdots \rangle \in fortup(V)$, let
a subtree $T^{x_t/\gamma_t/\theta_t}$ of $x_t$ be updated by $\delta S$ so that
$t$ becomes $t' = \langle x'_t, x_c, \cdots \rangle$. If $x_t/\gamma_t/\theta_t$
is not a prefix of any of the paths in the where clause of $\delta S$, then if
$t$ satisfies $cdn()$ of $V$, $t'$ also satisfies $cdn()$ of $V$.

The lemma is correct because the subtrees in the tuple used to test $cdn()$
are not changed by $\delta S$ when the condition of the lemma is met.

Lemma 3. Given a tuple $t = \langle x_t, x_c, \cdots \rangle \in fortup(V)$ and
its e-tree $e$, if the $T^{x_c/\gamma_c/\theta_c}$ in $t$ satisfies
$x_c/\gamma_c/\theta_c = strValu$, then $T^{\mathcal{L}_c/\theta_c}$ in $e$
satisfies $\mathcal{L}_c/\theta_c = strValu$, and vice versa.

The correctness of the lemma is guaranteed by the one-to-one correspondence
between $t$'s $\gamma$-trees and $e$'s children.

Lemma 4. Given a tuple $t = \langle x_t, x_c, \cdots \rangle \in fortup(V)$ and
its e-tree $e$, let $T$ be a tree identified by $x_t/\gamma_t/\theta_t$ in $t$
and $T'$ be the corresponding tree identified by $\mathcal{L}_t/\theta_t$ in
$e$. Obviously $T = T'$. As $\delta S$ and $\delta V$ have the same update
action, if $x_c$ satisfies the update condition, $\delta S(T) = \delta V(T')$.

Theorem 1. Update $\delta S$ is a precise translation of the view update
$\delta V$ if (i) $\mathcal{L}_t/\theta_t \neq \phi$ and $x_c = x_t$, and (ii)
$x_t/\gamma_t/\theta_t$ is not a prefix of any path in the where clause of
$\delta S$.

Proof. We follow Definition 6. Without losing generality, we assume that
$x_t = x_c = x_1$. Figure 2 illustrates the relationship between a variable
binding $x_1$ in the tuple $\langle x_1, \cdots \rangle$ and the e-tree built
from the tuple. The $\gamma$-trees $T^{x_1/\gamma_t}$ and $T^{x_1/\gamma_c}$ in
$x_1$ become the children of $e$ in the view. $T^{x_1/\gamma_t/\theta_t}$ and
$T^{x_1/\gamma_c/\theta_c}$ are an update target tree and a condition tree
respectively. $T^{x_1/\gamma_t/\theta_t}$'s children will be deleted or a new
child will be inserted.

Figure 2: Each tuple is mapped to an e-tree



(1) Correctness: $V(\delta S(S^i)) = \delta V(V(S^i))$

Consider two tuples $t_1 = \langle x_1^{(1)}, \cdots \rangle$ and
$t_2 = \langle x_1^{(2)}, \cdots \rangle$ in the evaluation of $\delta S$ where
$x_1^{(1)}$ and $x_1^{(2)}$ are copies of $x_1$. Obviously if $x_1^{(1)}$ is
updated, $x_1^{(2)}$ is updated too. That is, their source $x_1$ will be updated
twice although only the first is effective. As $\delta S$ and $V$ have the same
for clause, $t_1$ and $t_2$ exist in $fortup(V)$. Assume $e_1$ and $e_2$ are
mapped from $t_1$ and $t_2$ respectively by $V$. Then, either both $e_1$ and
$e_2$ are updated or none is updated.

$\supseteq$: Let $T^{\mathcal{L}_t/\theta_t}$ be a tree in an e-tree $e$ of
$V(S^i)$ updated to $\hat{T}^{\mathcal{L}_t/\theta_t}$ by $\delta V$ ($e$
becomes $e'$ after the update). We show that
$\hat{T}^{\mathcal{L}_t/\theta_t}$ is in $e'$ of $V(\delta S(S^i))$. In fact,
that $T^{\mathcal{L}_t/\theta_t}$ is in $V(S^i)$ means that there exists one and
only one tuple $t = \langle x_1, \cdots \rangle$ in $fortup(V)$ satisfying
$cdn()$, and that in the tuple, $x_1/\gamma_t/\theta_t$ identifies the source
tree $T^{x_1/\gamma_t/\theta_t}$ of $T^{\mathcal{L}_t/\theta_t}$.
$T^{\mathcal{L}_t/\theta_t}$ being updated by $\delta V$ means that there exists
a condition tree $T^{\mathcal{L}_c/\theta_c}$ in $e$ and the condition tree
satisfies $v/e/\mathcal{L}_c/\theta_c = strValu$.

On the other side, because $V$ and $\delta S$ have the same for clause, $t$ is
in $fortup(\delta S)$. Because $T^{\mathcal{L}_c/\theta_c}$ makes
$v/e/\mathcal{L}_c/\theta_c = strValu$ true, $T^{x_1/\gamma_c/\theta_c}$ makes
$x_1/\gamma_c/\theta_c = strValu$ true (Lemma 3). This means
$T^{x_1/\gamma_t/\theta_t}$ is updated by $\delta S$ and becomes
$\hat{T}^{x_1/\gamma_t/\theta_t}$. Thus $t$ becomes
$t' = \langle \hat{x}_1, \cdots \rangle$. Because of Lemma 4,
$\hat{T}^{x_1/\gamma_t/\theta_t} = \hat{T}^{\mathcal{L}_t/\theta_t}$. Because of
Condition (ii) of the theorem and Lemma 2, $t'$ satisfies $cdn()$ and generates
$e'$ in the view. So $\hat{T}^{\mathcal{L}_t/\theta_t}$ is in
$V(\delta S(S^i))$.

$\subseteq$: Let $\hat{T}_1^{\mathcal{L}_t/\theta_t}$ and
$\hat{T}_2^{\mathcal{L}_t/\theta_t}$ be two trees in $V(\delta S(S^i))$ whose
source tree(s) are updated by $\delta S$. We show that
$\hat{T}_1^{\mathcal{L}_t/\theta_t}$ and $\hat{T}_2^{\mathcal{L}_t/\theta_t}$
are in $\delta V(V(S^i))$. There are three cases: (a) the two trees share the
same source tree $T^{x_1/\gamma_t/\theta_t}$ (they must appear in different
e-trees in the view), and (b) the two trees have different source trees
$T_1^{x_1/\gamma_t/\theta_t}$ and $T_2^{x_1/\gamma_t/\theta_t}$. Case (b) has
two sub cases: (b.1) the two trees appear in different e-trees in the view, and
(b.2) the two trees appear in the same e-tree.

Case (a): That $T^{x_1/\gamma_t/\theta_t}$ is updated by $\delta S$ means that
there exist two tuples $\langle x_1^{(1)}, \cdots \rangle$ and
$\langle x_1^{(2)}, \cdots \rangle$ in $fortup(\delta S)$ such that
$x_1^{(1)} = x_1^{(2)}$, both tuples satisfy $cdn()$, and there exists a
condition tree $T^{x_1/\gamma_c/\theta_c}$ in each tuple satisfying
$x_c/\gamma_c/\theta_c = strValu$. $T^{x_1/\gamma_t/\theta_t}$ is updated to
$\hat{T}^{x_1/\gamma_t/\theta_t}$ by $\delta S$ (two update attempts with the
same action for the two tuples, only the effect of the first attempt is taken).
After the update, the tuples become
$t'_1 = \langle \hat{x}_1^{(1)}, \cdots \rangle$ and
$t'_2 = \langle \hat{x}_1^{(2)}, \cdots \rangle$. By Lemma 2, $t'_1$ and $t'_2$
satisfy $cdn()$ of $V$ and produce
$\hat{e}_1, \hat{e}_2 \in V(\delta S(S^i))$ with
$\hat{T}_1^{\mathcal{L}_t/\theta_t} \in \hat{e}_1$ and
$\hat{T}_2^{\mathcal{L}_t/\theta_t} \in \hat{e}_2$.

On the other side, when $V$ is evaluated against $S^i$, $x_1$ is copied to two
tuples $t_1 = \langle x_1^{(1)}, \cdots \rangle$ and
$t_2 = \langle x_1^{(2)}, \cdots \rangle$ in $fortup(V)$ and each of the tuples
satisfies $cdn()$. They produce e-trees $e'_1$ and $e'_2$. Because each tuple
has a condition tree $T^{x_1/\gamma_c/\theta_c}$ satisfying
$x_c/\gamma_c/\theta_c = strValu$, by Lemma 3 each of $e'_1$ and $e'_2$ has a
$T^{\mathcal{L}_c/\theta_c}$ satisfying $\mathcal{L}_c/\theta_c = strValu$ and
each has a $T^{\mathcal{L}_t/\theta_t}$. Thus
$T_1^{\mathcal{L}_t/\theta_t} \in e'_1$ and
$T_2^{\mathcal{L}_t/\theta_t} \in e'_2$ will be updated to
$\hat{T}_1^{\mathcal{L}_t/\theta_t}$ and $\hat{T}_2^{\mathcal{L}_t/\theta_t}$ by
$\delta V$, and $e'_1$ and $e'_2$ become $\hat{e}_1$ and $\hat{e}_2$ in
$\delta V(V(S^i))$.

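Case (b.1): That $T_1^{x_1/\gamma_t/\theta_t}$ and $T_2^{x_1/\gamma_t/\theta_t}$
are updated by $\delta S$ and that they appear in different e-trees mean that
there are two tuples $\langle x_{1[1]}, \cdots \rangle$ and
$\langle x_{1[2]}, \cdots \rangle$ where $x_{1[1]}$ and $x_{1[2]}$ are different
bindings of $x_1$, $T_1^{x_1/\gamma_t/\theta_t} \in x_{1[1]}$,
$T_2^{x_1/\gamma_t/\theta_t} \in x_{1[2]}$, and each of the tuples satisfies
$cdn()$ and $x_c/\gamma_c/\theta_c = strValu$. $T_1^{x_1/\gamma_t/\theta_t}$
and $T_2^{x_1/\gamma_t/\theta_t}$ become $\hat{T}_1^{x_1/\gamma_t/\theta_t}$ and
$\hat{T}_2^{x_1/\gamma_t/\theta_t}$ after the update and are mapped to
$\hat{T}_1^{\mathcal{L}_t/\theta_t}$ and $\hat{T}_2^{\mathcal{L}_t/\theta_t}$ in
two different e-trees of $V(\delta S(S^i))$. Following the same argument as in
Case (a), $\hat{T}_1^{\mathcal{L}_t/\theta_t}$ and
$\hat{T}_2^{\mathcal{L}_t/\theta_t}$ are in $\delta V(V(S^i))$.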

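Case (b.2): That $T_1^{x_1/\gamma_t/\theta_t}$ and $T_2^{x_1/\gamma_t/\theta_t}$
are updated by $\delta S$ and that they appear in a single e-tree mean that
there is one and only one tuple $\langle x_1, \cdots \rangle$ where
$T_1^{x_1/\gamma_t/\theta_t}, T_2^{x_1/\gamma_t/\theta_t} \in x_1$. The tuple
satisfies $cdn()$ and there is a tree $T^{x_1/\gamma_c/\theta_c}$ in the tuple
satisfying $x_1/\gamma_c/\theta_c = strValu$. $T_1^{x_1/\gamma_t/\theta_t}$ and
$T_2^{x_1/\gamma_t/\theta_t}$ become $\hat{T}_1^{x_1/\gamma_t/\theta_t}$ and
$\hat{T}_2^{x_1/\gamma_t/\theta_t}$ after the update and are mapped to
$\hat{T}_1^{\mathcal{L}_t/\theta_t}$ and $\hat{T}_2^{\mathcal{L}_t/\theta_t}$ in
a single e-tree of $V(\delta S(S^i))$. On the other side, as
$T_1^{x_1/\gamma_t/\theta_t}$ and $T_2^{x_1/\gamma_t/\theta_t}$ are mapped to a
single e-tree $e$ and share the same condition tree
$T^{x_1/\gamma_c/\theta_c}$, $T_1^{\mathcal{L}_t/\theta_t}$ and
$T_2^{\mathcal{L}_t/\theta_t}$ share the same condition tree
$T^{\mathcal{L}_c/\theta_c}$ in $e$ and will be updated by $\delta V$. So
$\hat{T}_1^{\mathcal{L}_t/\theta_t}$ and $\hat{T}_2^{\mathcal{L}_t/\theta_t}$
are in the e-tree of $\delta V(V(S^i))$.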
(2) $\delta S$ is minimal

We prove by contradiction. Let $T^{\mathcal{L}_t/\theta_t}$ be a tree in the
view updated by $\delta V$. Then from the above proofs,
$T^{x_1/\gamma_t/\theta_t}$ is updated by $\delta S$ and there exists a tuple
$\langle x_1, \cdots \rangle$ such that $T^{x_1/\gamma_t/\theta_t}$ is in $x_1$
and $x_1$ has a condition tree $T^{x_1/\gamma_c/\theta_c}$ satisfying
"$cdn()$ and $x_1/\gamma_c/\theta_c = strValu$".

If $T^{x_1/\gamma_t/\theta_t}$ is not updated by $\delta S'$, then either (a)
$x_1$ is not a variable in the for-clause of $\delta S'$, i.e., $x_1$ is not in
any tuple and neither is $T^{x_1/\gamma_t/\theta_t}$, or (b) $x_1$ is in the
tuple $\langle x_1, \cdots \rangle$ but $T^{x_1/\gamma_t/\theta_t}$ is not in
$x_1$, or (c) $x_1$ is in the tuple $\langle x_1, \cdots \rangle$ and
$T^{x_1/\gamma_t/\theta_t}$ is in $x_1$ but one of "$cdn()$" and
"$x_c/\gamma_c/\theta_c = strValu$" is not in $\delta S'$.

In Case (a), because $x_1$ is not a variable in $\delta S'$,
$T^{x_1/\gamma_t/\theta_t}$ will not be updated by $\delta S'$ (this does not
prevent $T^{x_1/\gamma_t/\theta_t}$ from appearing in the view). This means
that the $T^{\mathcal{L}_t/\theta_t}$ in $V(\delta S'(S^i))$ is different from
the $T^{\mathcal{L}_t/\theta_t}$ in $\delta V(V(S^i))$ because the assumption
assumes that the $T^{\mathcal{L}_t/\theta_t}$ in $\delta V(V(S^i))$ is updated.
This contradicts the correctness of $\delta S'$.

Figure 3: Books and their references

In Case (b), because $T^{x_1/\gamma_t/\theta_t}$ is not in $x_1$,
$T^{x_1/\gamma_t/\theta_t}$ is not in $V(S^i)$. This contradicts the assumption
that $T^{\mathcal{L}_t/\theta_t}$ is in the view.

In Case (c), if $cdn()$ is violated, the tuple of $T^{x_1/\gamma_t/\theta_t}$
will not be selected by $V$, so $T^{x_1/\gamma_t/\theta_t}$ is not in $V(S^i)$,
which contradicts the assumption. If $x_1/\gamma_c/\theta_c = strValu$ is
violated, $T^{x_1/\gamma_t/\theta_t}$ will not be updated by $\delta V$. This
contradicts the assumption that $T^{\mathcal{L}_t/\theta_t}$ is updated by
$\delta V$.

This concludes that $\delta S$ is a precise translation.

□ 

We note that the theorem gives only a sufficient condition but not a necessary
condition. The reason is that there exist other cases where a view update is
translatable. These will be further presented in the following sections.
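
The two conditions of Theorem 1 can be checked mechanically once the condition and target paths have been mapped to the source by Procedure 1. The sketch below assumes the mapped paths are given as '/'-separated strings led by the variable name and that $\mathcal{L}_t/\theta_t \neq \phi$ has already been established; the function names are hypothetical.

```python
def is_prefix(p, q):
    """True if path p is a prefix of path q, compared name by name."""
    a, b = p.split("/"), q.split("/")
    return a == b[:len(a)]


def theorem1_applies(cond_path, target_path, where_paths):
    """Check x_c = x_t and that the target path prefixes no where-clause path."""
    same_variable = cond_path.split("/")[0] == target_path.split("/")[0]
    safe_target = not any(is_prefix(target_path, w) for w in where_paths)
    return same_variable and safe_target
```

Since the conditions are only sufficient, a result of `False` means that Theorem 1 does not apply, not that the update is untranslatable.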

We use an example to show how a view update is translated using the results. 
Figure [3] shows two XML documents. Document (a) stores book information 
where auths and aName mean authors and author-name elements respectively. 
Document (b) stores university subject, textbook and professor information 
where uName, subjs, sName, profs, and pName mean university-name, sub- 
jects, subject-name, professors, and professor-name respectively. 

The view Qbk is defined below to contain, for each use of a book by a
university subject, the author names and the title of the book, the name of the
university and the professors using the book in their teaching.

<Qbk>{ for x in doc("bkInf.xml")/bkInf/book,
           y in doc("subjInf.xml")/subjInf/uni,
           z in y/subjs/subj
       where x/title=z/title
       return <use>{x/auths}{x/title}{y/uName}{z/profs}</use>
}</Qbk>

The view instance for the XML documents is shown in Figure 4.

Figure 4: author-books and universities using them

Now assume that the user of the view wants to add author Susan to the
textbook IS in the view using the update statement below.

for r in view(Qbk)/Qbk/use
where r/title="IS"
update r/auths { insert <aName>Susan</aName> }



With this statement, the user expects that next time when the view is
selected, the output is Figure 5(a) where trees $v_b$, $v_c$ and $v_d$ are the
same as those of Figure 4 and tree $v_e$ contains the newly added author Susan.

Figure 5: An insertion update. (a) The view Qbk; (b) the source document bkInf.

In the update statement, the update condition path and the update tar- 
get path are r /title and r/auths. The full view paths of the two paths are: 
Qbk / use/ title and Qbk/use/auths. In the paths, Qbk is v of Formula ([2]), use 
is c, title is £ c , auths is £ t , and 9 C and t are (j). Following Procedure [I] by using 
title and auths, we find the expressions x /title and x/ auths. By Algorithm [lj 
the following source update is derived: 

for x in docC'bklnf .xml")/bklnf /book, 

y in doc ( "subj Inf .xml") /subjlnf /uni , 

z in $y/subjs/subj 
where x/title=z/title and x/title="IS" 
update x/auths { insert <aName>Susan</aName>} 

When this statement is executed against Figure 3(a), the document becomes
Figure 5(b), where trees $v_2$, $v_3$ and $v_4$ are the same as those in Figure 3(a) and
$v_{21}$ is changed. The view instance will appear as expected by the user when
selected next time.
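The expectation that the translated update reproduces the user's intended view can be checked mechanically. The sketch below, in Python with `xml.etree.ElementTree`, applies the view update to a materialised view and the translated update to the source, then compares the two results. The data, the `pName` element and all function names are assumptions made for this illustration. Updating each matching source node only once is also an assumption about the update semantics.

```python
import xml.etree.ElementTree as ET
from copy import deepcopy

# Hypothetical data (invented); only the element names come from the example.
bk = ET.fromstring(
    "<bkInf>"
    "<book><title>IS</title><auths><aName>John</aName></auths></book>"
    "<book><title>DB</title><auths><aName>Mary</aName></auths></book>"
    "</bkInf>")
subj = ET.fromstring(
    "<subjInf>"
    "<uni><uName>UniA</uName><subjs>"
    "<subj><title>IS</title><profs><pName>Tim</pName></profs></subj>"
    "<subj><title>DB</title><profs><pName>Ann</pName></profs></subj>"
    "</subjs></uni>"
    "<uni><uName>UniB</uName><subjs>"
    "<subj><title>IS</title><profs><pName>Bob</pName></profs></subj>"
    "</subjs></uni>"
    "</subjInf>")

def join_tuples(bk_root, subj_root):
    """Return the (x, y, z) bindings that satisfy x/title = z/title."""
    return [(x, y, z)
            for x in bk_root.findall("book")
            for y in subj_root.findall("uni")
            for z in y.findall("subjs/subj")
            if x.findtext("title") == z.findtext("title")]

def eval_qbk(bk_root, subj_root):
    view = ET.Element("Qbk")
    for x, y, z in join_tuples(bk_root, subj_root):
        use = ET.SubElement(view, "use")
        for node in (x.find("auths"), x.find("title"),
                     y.find("uName"), z.find("profs")):
            use.append(deepcopy(node))
    return view

def view_update(view, title, name):
    """The user's update, applied directly to a materialised view instance."""
    for r in view.findall("use"):
        if r.findtext("title") == title:
            ET.SubElement(r.find("auths"), "aName").text = name

def source_update(bk_root, subj_root, title, name):
    """The translated update: same join, target path mapped to x/auths."""
    targets = []
    for x, _, _ in join_tuples(bk_root, subj_root):
        auths = x.find("auths")
        # Assumption: a source node matched by several tuples is updated once.
        if x.findtext("title") == title and all(auths is not t for t in targets):
            targets.append(auths)
    for auths in targets:
        ET.SubElement(auths, "aName").text = name

expected = eval_qbk(bk, subj)
view_update(expected, "IS", "Susan")      # view update applied to V(S)
source_update(bk, subj, "IS", "Susan")    # translated update applied to S
actual = eval_qbk(bk, subj)               # view re-evaluated on the updated source
print(ET.tostring(actual) == ET.tostring(expected))
```

The comparison succeeds here because the condition path and the target path are both led by the same variable x, so every view copy of the updated auths subtree satisfies the update condition.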

4 Update Translation when $\mathcal{L}_t/\theta_t \neq \phi$ and $x_c \neq x_t$

We look into the translation problem when the mappings of the update condition 
path and the update target path are led by different variables. The results of 
this section generalize the view update problem in the relational views when 
they are defined with the join operator. 

In general, view updates are not translatable in the case of $x_c \neq x_t$.
Consider two tuples where the binding $x_t$ is copied to $x_t^{(1)}$ and $x_t^{(2)}$ to combine
with two bindings $x_{c[1]}$ and $x_{c[2]}$ of $x_c$ by the context-based production as

$\langle \cdots, x_t^{(1)}, \cdots, x_{c[1]}, \cdots \rangle$
$\langle \cdots, x_t^{(2)}, \cdots, x_{c[2]}, \cdots \rangle$

Assume that in the view, the update condition $x_c/\gamma_c/\theta_c$ is satisfied in $x_{c[1]}$ but
violated in $x_{c[2]}$. Then the copy of $x_t$ corresponding to the first tuple will be
updated but the one corresponding to the second tuple will not. In the source, if $x_t$ is updated,
not only the first copy of $x_t$ changes, but also the second copy. In other words,
the translated source update has a view side-effect. However, if $x_t$ in the source
is not updated, none of its copies in the view will be changed.
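The side effect can be reproduced on a toy join. The following Python sketch uses plain dictionaries instead of XML. The field names `key`, `val` and `flag` and all function names are hypothetical and serve only to make the two copies of $x_t$ visible.

```python
# Hypothetical source: one target binding x_t joined with two condition
# bindings x_c, only the first of which satisfies the update condition.
source = {
    "xt": [{"key": 1, "val": "old"}],
    "xc": [{"key": 1, "flag": "yes"}, {"key": 1, "flag": "no"}],
}

def evaluate_view(src):
    """Join x_t with x_c on key; every view tuple carries a copy of x_t's value."""
    return [{"val": t["val"], "flag": c["flag"]}
            for t in src["xt"] for c in src["xc"] if t["key"] == c["key"]]

def view_update(view, flag, new_val):
    """User intent: change val only in view tuples whose x_c part meets the condition."""
    return [dict(e, val=new_val) if e["flag"] == flag else dict(e) for e in view]

def candidate_source_update(src, flag, new_val):
    """Candidate translation: update x_t if some joined x_c meets the condition."""
    out = {"xt": [dict(t) for t in src["xt"]],
           "xc": [dict(c) for c in src["xc"]]}
    for t in out["xt"]:
        if any(c["key"] == t["key"] and c["flag"] == flag for c in out["xc"]):
            t["val"] = new_val
    return out

expected = view_update(evaluate_view(source), "yes", "new")
actual = evaluate_view(candidate_source_update(source, "yes", "new"))
print(expected)
print(actual)
```

The two printed lists differ in the second tuple: the view update leaves its copy of the value untouched, while the candidate source update changes it as well. Leaving the source unchanged keeps both copies at the old value, so neither choice matches the user's intent.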

Although view updates are generally not translatable when $x_c \neq x_t$, a precise
translation exists for the following view update.

V: <v>{ for $x_1$ in $p_1$, $\cdots$, $x_n$ in $p_n$    (6)
        where $\cdots$ and $x_c/\gamma_c/\theta_c = x_{c+1}/\gamma_{c+1}/\theta_{c+1}$ and $\cdots$
        return $rtn(x_1, \cdots, x_n)$ }</v>

where $x_c/\gamma_c$ is in $rtn(x_1, \cdots, x_n)$, i.e., $x_c/\gamma_c/\theta_c$ is exposed in the view.

$\delta V$: $(p_s,\ v/e/\mathcal{L}_c/\theta_c = strValu,\ v/e/\mathcal{L}_t/\theta_t,\ del(T)|ins(T))$    (7)

where $x_t$ is either $x_c$ or $x_{c+1}$.

The condition requires that, in the view definition, $x_c/\gamma_c$ must be a prefix
of the join path $x_c/\gamma_c/\theta_c$. At the same time, the path in the view
mapped from $x_c/\gamma_c/\theta_c$ must be the update condition path. Furthermore, the
mapping of the update target path must be led by the same variable $x_c$ that leads
the update condition path, or by the variable $x_{c+1}$ that joins $x_c$ in the view
definition.

Consider Example 1. With the condition y/D = z and z = "1" in the
where clause, for a view update to be translatable, the mapping $x_c/\gamma_c/\theta_c$ of
the view update condition path must be z or y/D, and the mapping $x_t/\gamma_t/\theta_t$
of the view update target path must end with F, G or E. We note that if
$x_t/\gamma_t/\theta_t$ ends with C or H, then $x_t/\gamma_t/\theta_t$ is a prefix of one of the paths in
the join condition and the update will not be translatable.

Theorem 2. Given the view V and a view update $\delta V$ defined above, update $\delta S$
of Formula ^ is a precise translation of the view update $\delta V$ if (i) $\mathcal{L}_t/\theta_t \neq \phi$,
and (ii) $x_t/\gamma_t/\theta_t$ is not a prefix of any path in the where clause of $\delta S$.

Proof. The notation of this proof follows that of the proof for Theorem 1 and
Figure 2. Consider two tuples $t_1 = \langle x_t^{(1)}, x_{c[1]}, \cdots \rangle$ and $t_2 = \langle x_t^{(2)}, x_{c[2]}, \cdots \rangle$
in the evaluation of $\delta S(S)$, where $x_t^{(1)}$ and $x_t^{(2)}$ are copies of $x_t$, and $x_{c[1]}$ and
$x_{c[2]}$ can be the same. If one is updated by $\delta S$, the other is updated too. The
reason is that for $T_1^{x_t/\gamma_t/\theta_t} \in x_t^{(1)}$ and $T_2^{x_t/\gamma_t/\theta_t} \in x_t^{(2)}$, because of the join
condition in Formula (6), $x_c/\gamma_c/\theta_c = x_{c+1}/\gamma_{c+1}/\theta_{c+1}$, and because $x_{c+1} = x_t$ and
$x_t^{(1)} = x_t^{(2)}$, a condition tree $T_1^{x_c/\gamma_c/\theta_c}$ exists for $T_1^{x_t/\gamma_t/\theta_t}$ and $T_2^{x_c/\gamma_c/\theta_c}$ exists
for $T_2^{x_t/\gamma_t/\theta_t}$, and $T_1^{x_c/\gamma_c/\theta_c} = T_2^{x_c/\gamma_c/\theta_c}$. Consequently, if $T_1^{x_c/\gamma_c/\theta_c}$ satisfies the
update condition, so does $T_2^{x_c/\gamma_c/\theta_c}$. So either both $T_1^{x_t/\gamma_t/\theta_t}$ and $T_2^{x_t/\gamma_t/\theta_t}$ are
updated or neither is updated. Following Lemma 4, if $e_1$ and $e_2$ are mapped from
$T_1^{x_t/\gamma_t/\theta_t}$ and $T_2^{x_t/\gamma_t/\theta_t}$ respectively and one is updated, the other is updated too.

The remaining proof can be completed by following the argument of the
proof of Theorem 1. □



5 Update Translation when $\mathcal{L}_t/\theta_t = \phi$

In this section, we identify translatable cases where $\mathcal{L}_t/\theta_t = \phi$, that is, the
update target path is $v$ or $v/e$. In the case of $v$, the update itself is an addition
or a removal of an $e$-tree. In the case of $v/e$, the update is an insertion or a
deletion of a $\gamma$-tree.

Obviously, if the user does not know the structure of the view, wrong subtrees
can be added. As an example, consider $Q_1$ in Figure 6. The path $Q_1/E$ allows
child elements labeled with C. If the user adds a subtree labeled with F under
$v_u$, the update violates the view definition. We exclude this type of case and
assume that the user knows the structure of the view and that the updates aim to
maintain such a structure.

In general, insertion updates are not translatable when $\mathcal{L}_t/\theta_t = \phi$. A number
of reasons exist for this. The first is that there is no unique way to apply
insertions to the source documents in many cases. The second reason is that
the updates violate the context-based production. The third reason is that there is no
way for the user to write an update statement with a specific enough condition
to update the view while the context-based production is not violated. We use
three examples to illustrate the reasons.

<Q1>{ for x in doc(r)/r/A/C
      where x/G=1
      return <E>{x}</E>
}</Q1>

<Q2>{ for x in r/A, y in /r/B, z in y/D
      where x/C/G=y/F and x/C/G=1
      return <E: ...

[Instance trees of Q1 and Q2 omitted]

Figure 6: Two views to show updates to E and to Q

Example 2. Consider $Q_1$ in Figure 6. If another subtree (E (C (W : 2)(G : 8)))
is inserted to $Q_1$, in the source the subtree (C (W : 2)(G : 8)) needs to be
inserted to r. We cannot find a unique way to do so, as the subtree can be
inserted to an existing A element, or a new A element can be created and the subtree
inserted under the new A element.
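The ambiguity in this example can be made concrete with a small Python sketch using `xml.etree.ElementTree`. The source document and the function names are invented. The inserted subtree is given G = 1 (rather than the value used in the example) purely so that it passes the where clause of $Q_1$ as printed, which is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET
from copy import deepcopy

def eval_q1(r):
    """Q1: one <E> per C element under r/A whose G child equals 1."""
    view = ET.Element("Q1")
    for x in r.findall("A/C"):
        if x.findtext("G") == "1":
            ET.SubElement(view, "E").append(deepcopy(x))
    return view

def new_c():
    # Hypothetical subtree to insert; G is set to 1 so it passes the where clause.
    return ET.fromstring("<C><W>2</W><G>1</G></C>")

SRC = "<r><A><C><W>6</W><G>1</G></C></A></r>"  # invented source document

# Candidate 1: insert the C subtree under the existing A element.
s1 = ET.fromstring(SRC)
s1.find("A").append(new_c())

# Candidate 2: create a new A element and insert the C subtree under it.
s2 = ET.fromstring(SRC)
ET.SubElement(s2, "A").append(new_c())

same_view = ET.tostring(eval_q1(s1)) == ET.tostring(eval_q1(s2))
same_source = ET.tostring(s1) == ET.tostring(s2)
print(same_view, same_source)
```

Both candidate source documents yield the same view instance although the documents themselves differ, so the view update alone gives no basis for choosing between them.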



Example 3. Consider $Q_1$ in Figure 6 again. If an update is an insertion of
(C (W : 2)(G : 8)) under $v_u$, the context-based production is violated. By the
context-based production, if x in the return clause is not followed by any path
expression, only one C element is allowed in each E tree.



Example 4. Consider $Q_2$ in Figure 6, where C elements are selected by x/C in
the return clause. If the user wants to insert another C element under both $v_a$
and $v_s$ (but not the other E elements) such that the context-based production
is satisfied, the user has no way to specify an accurate condition for this because
the node identifiers, $v_a$ and $v_s$, are not available to the user.

For the same reasons, many deletion updates are not translatable. However 
in the case where all the expressions in the return clause start with the same 
variable, deletion updates to such views are translatable. We show the details 
below. 

Let the view definition be 

V: <v>{ for $x_1$ in $p_1$, $\cdots$, $x_n$ in $p_n$
        where $cdn(x_1, \cdots, x_n)$    (8)
        return $rtn(x_1)$ }</v>

In the view, only the variable $x_1$ is involved in the return clause. Let the
update statement to the view be

$\delta V$: for e in v/e
        where $e/\mathcal{L}_c/\theta_c$ = aVal    (9)
        update e (delete $\mathcal{L}_t$)

The translated source update is

$\delta S$: for $x_1$ in $p_1$, $\cdots$, $x_n$ in $p_n$
        where $cdn(x_1, \cdots, x_n)$ and $x_1/\gamma_c/\theta_c$ = aVal    (10)
        update $x_1/\gamma_t/..$ (delete $\mathcal{L}_t$)

In the formulae, $\mathcal{L}_t$ is the last element of $x_1/\gamma_t$. To allow an $\mathcal{L}_t$ node to be inserted
to or deleted from the source document, the target path must be $x_1/\gamma_t/..$.
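A minimal Python sketch of this deletion translation follows, assuming a concrete view whose return clause is `<e>{x/C}{x/M/D}</e>` with join condition x/K = y/F. The view, the data and all function names are invented for illustration. Here the condition path is x/C, the target element is D, and the translated target path is the parent x/M.

```python
import xml.etree.ElementTree as ET
from copy import deepcopy

# Invented source; only the variable x appears in the assumed return clause.
SRC = ("<r>"
       "<A><K>k1</K><C>1</C><M><D>d1</D></M></A>"
       "<A><K>k2</K><C>2</C><M><D>d2</D></M></A>"
       "<B><F>k1</F></B><B><F>k2</F></B>"
       "</r>")

def tuples(r):
    """for x in r/A, y in r/B where x/K = y/F"""
    return [(x, y) for x in r.findall("A") for y in r.findall("B")
            if x.findtext("K") == y.findtext("F")]

def eval_view(r):
    """return <e>{x/C}{x/M/D}</e> for every join tuple"""
    v = ET.Element("v")
    for x, _ in tuples(r):
        e = ET.SubElement(v, "e")
        e.extend([deepcopy(n) for n in x.findall("C") + x.findall("M/D")])
    return v

def view_delete(v, a_val):
    """View update: for e in v/e where e/C = aVal update e (delete D)"""
    for e in v.findall("e"):
        if e.findtext("C") == a_val:
            for d in e.findall("D"):
                e.remove(d)

def source_delete(r, a_val):
    """Translated update: same for/where plus x/C = aVal; target is x/M/D/.. = x/M."""
    for x, _ in tuples(r):
        if x.findtext("C") == a_val:
            for parent in x.findall("M"):
                for d in parent.findall("D"):
                    parent.remove(d)

src = ET.fromstring(SRC)
expected = eval_view(src)
view_delete(expected, "1")     # view update applied to the view instance
source_delete(src, "1")        # translated update applied to the source
actual = eval_view(src)        # view re-evaluated on the updated source
print(ET.tostring(actual) == ET.tostring(expected))
```

The target path x/M is chosen so that it is not a prefix of the where-clause paths x/K and x/C. If the deleted element sat directly under x, the parent path would be x itself, which is a prefix of those paths.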

Theorem 3. Given the view definition V, the source update $\delta S$ is a precise
translation of the view update $\delta V$ if $x_1/\gamma_t/..$ is not a prefix of any of the paths
in the where clause of $\delta S$.

Proof: We follow the definition of a precise translation to prove $V(\delta S(S^i)) = \delta V(V(S^i))$ and omit
the proof that $\delta S$ is minimal. We note that $\mathcal{L}_t \neq \mathcal{L}_c$ implies $x_1/\gamma_c \neq x_1/\gamma_t$.

$\subseteq$: Assume that $e'_1$ and $e'_2$ are two $e$-trees in $V(\delta S(S^i))$. Then there exist
two tuples $t'_1 = \langle x_1^{\prime(1)}, \cdots \rangle$ and $t'_2 = \langle x_1^{\prime(2)}, \cdots \rangle$ for $e'_1$ and $e'_2$, and they
satisfy cdn() of V. That the two tuples are updated by $\delta S$ means that they
are the results of updating two tuples $t_1 = \langle x_1^{(1)}, \cdots \rangle$ and $t_2 = \langle x_1^{(2)}, \cdots \rangle$
by $\delta S()$, that $t_1$ and $t_2$ satisfy cdn() and have condition trees $T_1^{x_1/\gamma_c/\theta_c}$ and
$T_2^{x_1/\gamma_c/\theta_c}$ satisfying $x_1/\gamma_c/\theta_c$ = aVal, and that the update deletes trees like $T^{x_1/\gamma_t}$.
Consequently $T^{\mathcal{L}_t}$s are not in $e'_1$ and $e'_2$.

On the other side, as $t_1$ and $t_2$ satisfy cdn(), they produce $e_1$ and $e_2$ in
$V(S^i)$. At the same time, as $e_1$ and $e_2$ have condition trees $T_1^{\mathcal{L}_c/\theta_c}$ and $T_2^{\mathcal{L}_c/\theta_c}$
satisfying $\mathcal{L}_c/\theta_c$ = aVal (Lemma 3), they are updated, as $T^{\mathcal{L}_t}$s will be deleted
from them. So they become $e'_1$ and $e'_2$ and are in $\delta V(V(S^i))$.

$\supseteq$: Let $e'_1$ and $e'_2$ be $e$-trees in $\delta V(V(S^i))$. Then there exist $e_1$ and $e_2$ in
$V(S^i)$ and $\delta V$ deletes $T^{\mathcal{L}_t}$s from them. That is, $e_1$ and $e_2$ have condition trees
satisfying cdn() and $\mathcal{L}_c/\theta_c$ = aVal. $e_1$ and $e_2$ are for two tuples $t_1 = \langle x_1^{(1)}, \cdots \rangle$
and $t_2 = \langle x_1^{(2)}, \cdots \rangle$ in V, and the two tuples satisfy cdn().

On the other side, as $t_1$ and $t_2$ satisfy cdn() and $x_1/\gamma_c/\theta_c$ = aVal (Lemma
3), they will be updated and $T^{\mathcal{L}_t}$s will be deleted from them. So because of
Lemma 2 they become $t'_1 = \langle x_1^{\prime(1)}, \cdots \rangle$ and $t'_2 = \langle x_1^{\prime(2)}, \cdots \rangle$. When V
is evaluated against $\delta S(S^i)$, $t'_1$ and $t'_2$ produce $e'_1$ and $e'_2$, which do not contain any
$T^{\mathcal{L}_t}$s. So they are in $V(\delta S(S^i))$.

$\delta S$ is minimal: If a tree is not relevant to the view, the tree does not satisfy
$cdn(x_1, \cdots, x_n)$ and it will not be updated by $\delta S$. □

For the same view definition V in (8), if the update is applied to the root
node as follows,

$\delta V$: for u in v
        where $u/e/\mathcal{L}_c/\theta_c$ = aVal    (11)
        update u (delete e)

the translated source update is

$\delta S$: for $x_1$ in $p_1$, $\cdots$, $x_n$ in $p_n$
        where $cdn(x_1, \cdots, x_n)$ and $x_1/\gamma_c/\theta_c$ = aVal    (12)
        update $x_1/..$ (delete $L(x_1)$)

We note that when an $e$ node is deleted, deleting all the $\gamma$ trees from their
parent nodes in the source document is not enough. The binding of the variable
must be deleted.
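The difference between deleting only the returned subtrees and deleting the binding itself can be seen in the following Python sketch. It assumes a view `<e>{x/C}{x/D}</e>` over invented data with join condition x/K = y/F. All names are hypothetical.

```python
import xml.etree.ElementTree as ET
from copy import deepcopy

SRC = ("<r>"
       "<A><K>k1</K><C>1</C><D>d1</D></A>"
       "<A><K>k2</K><C>2</C><D>d2</D></A>"
       "<B><F>k1</F></B><B><F>k2</F></B>"
       "</r>")  # invented data

def joined(r):
    """Bindings of x in r/A that join with some y in r/B on x/K = y/F."""
    return [x for x in r.findall("A")
            if any(x.findtext("K") == y.findtext("F") for y in r.findall("B"))]

def eval_view(r):
    """return <e>{x/C}{x/D}</e> for every joined binding of x"""
    v = ET.Element("v")
    for x in joined(r):
        e = ET.SubElement(v, "e")
        e.extend([deepcopy(n) for n in x.findall("C") + x.findall("D")])
    return v

def delete_gamma_trees(r, a_val):
    """Insufficient attempt: remove only the returned subtrees of matching bindings."""
    for x in joined(r):
        if x.findtext("C") == a_val:
            for n in x.findall("C") + x.findall("D"):
                x.remove(n)

def delete_binding(r, a_val):
    """In the style of Formula (12): remove the binding of x from its parent."""
    for x in joined(r):
        if x.findtext("C") == a_val:
            r.remove(x)

s1 = ET.fromstring(SRC)
delete_gamma_trees(s1, "1")
s2 = ET.fromstring(SRC)
delete_binding(s2, "1")
print(len(eval_view(s1)), len(eval_view(s2)))
```

After the first attempt the binding still satisfies the join, so the view keeps an empty e element. Only removing the binding makes the e-tree disappear from the view.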

Theorem 4. Given view definition V in Formula (8), the source update $\delta S$ in
Formula (12) is a precise translation of the view update $\delta V$ in Formula (11).

Proof: Let $t_1 = \langle x_1^{(1)}, \cdots \rangle$ and $t_2 = \langle x_1^{(2)}, \cdots \rangle$ be two tuples in
fortup(V), $x_1^{(1)}$ and $x_1^{(2)}$ be two copies of $x_1$ in the source, $e_1$ and $e_2$ be two
$e$-trees for the tuples in $V(S^i)$, and let $e_1$ and $e_2$ be deleted by $\delta V$. Because $e_1$ and
$e_2$ are in $V(S^i)$, $t_1$ and $t_2$ satisfy cdn(). $e_1$ and $e_2$ being deleted by $\delta V$ means
that each of them has a subtree $T^{\mathcal{L}_c/\theta_c}$ satisfying $\mathcal{L}_c/\theta_c$ = aVal. By Lemma
5, each of $t_1$ and $t_2$ has a tree $T^{x_1/\gamma_c/\theta_c}$ satisfying $x_1/\gamma_c/\theta_c$ = aVal. Thus $t_1$
and $t_2$ will be updated by $\delta S$, meaning the binding of $x_1$ will be deleted from
the source. Consequently $t_1$ and $t_2$ will not be in $fortup(V(\delta S()))$, and $e_1$ and
$e_2$ will not be in $V(\delta S())$.

The proof that $\delta S$ is minimal is similar to that of Theorem 1. □

6 Conclusion 

In this paper, we defined the view update problem in XML and showed the
factors determining the translation problem. We identified the cases where view
updates are translatable, presented a translation algorithm, gave the translated
source updates, and proved that the source updates are precise.

The translatability of view updates is information dependent. In this paper, 
we assume the only information available is the view definition and the update. 
When other information such as keys and references is used in the translation,
different algorithms and different source updates may be obtained. We leave
the investigation of these problems as future work. 